Pregelix: Big(ger) Graph Analytics on A Dataflow Engine
There is a growing need for distributed graph processing systems that are
capable of gracefully scaling to very large graph datasets. Unfortunately, this
challenge has not been easily met due to the intense memory pressure imposed by
process-centric, message passing designs that many graph processing systems
follow. Pregelix is a new open source distributed graph processing system that
is based on an iterative dataflow design that is better tuned to handle both
in-memory and out-of-core workloads. As such, Pregelix offers improved
performance characteristics and scaling properties over current open source
systems (e.g., we have seen up to 15x speedup compared to Apache Giraph and up
to 35x speedup compared to distributed GraphLab), and makes more effective use
of available machine resources to support Big(ger) Graph Analytics.
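To make the contrast concrete, below is a minimal sketch of the vertex-centric, message-passing (Pregel-style) programming model that Pregelix evaluates as an iterative dataflow; the PageRank vertex logic and function names are illustrative assumptions, not Pregelix's actual Java API.

```python
# Hypothetical sketch of one Pregel-style superstep (illustrative only,
# not Pregelix's Java API): each vertex consumes the messages sent to it
# in the previous superstep, updates its value, and emits messages along
# its out-edges. A dataflow engine such as Pregelix evaluates the same
# semantics with joins and group-bys, so vertex state and messages can
# spill out of core instead of being pinned in memory.

DAMPING = 0.85

def pagerank_superstep(vertices, incoming, num_vertices):
    """vertices: {vid: (rank, out_edges)}; incoming: {vid: [msg, ...]}."""
    outgoing = {}
    for vid, (rank, out_edges) in vertices.items():
        msgs = incoming.get(vid, [])
        new_rank = (1 - DAMPING) / num_vertices + DAMPING * sum(msgs)
        vertices[vid] = (new_rank, out_edges)
        if out_edges:  # scatter this vertex's rank share to its neighbours
            share = new_rank / len(out_edges)
            for dst in out_edges:
                outgoing.setdefault(dst, []).append(share)
    return vertices, outgoing
```

A driver would call this function once per superstep, feeding each superstep's outgoing messages back in as the next superstep's incoming messages until the ranks converge.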
The Implications of Diverse Applications and Scalable Data Sets in Benchmarking Big Data Systems
We now live in an era of big data, and big data applications are becoming
more and more pervasive. How to benchmark data center computer systems running
big data applications (in short, big data systems) is a hot topic. In this
paper, we focus on measuring the performance impacts of diverse applications
and scalable volumes of data sets on big data systems. For four typical data
analysis applications, an important class of big data applications, we find
two major results through experiments. First, the data scale has a significant
impact on the performance of big data systems, so we must provide scalable
volumes of data sets in big data benchmarks. Second, even though all four
applications use simple algorithms, their performance trends differ as the
data scale increases, so we must consider not only the variety of data sets
but also the variety of applications in benchmarking big data systems.
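As a rough illustration of the kind of experiment reported here, the sketch below times one application at several data scales and records the resulting trend; generate_dataset and run_application are hypothetical placeholders for whatever data generator and workload a benchmark suite provides.

```python
import time

# Hypothetical scale-sweep sketch (not the paper's harness): run one
# application over datasets of increasing size and record how runtime
# grows, which is the kind of trend the paper argues big data benchmarks
# must be able to expose.

def run_scale_sweep(generate_dataset, run_application, scales):
    """generate_dataset(n) -> dataset, run_application(dataset) -> result."""
    results = []
    for n in scales:
        data = generate_dataset(n)
        start = time.perf_counter()
        run_application(data)
        elapsed = time.perf_counter() - start
        results.append((n, elapsed))
        print(f"scale={n:>12,d}  runtime={elapsed:8.2f}s")
    return results

# Example usage with toy stand-ins:
# run_scale_sweep(lambda n: list(range(n, 0, -1)), sorted, [10**5, 10**6, 10**7])
```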
Characterizing and Subsetting Big Data Workloads
Big data benchmark suites must include a diversity of data and workloads to
be useful in fairly evaluating big data systems and architectures. However,
using truly comprehensive benchmarks poses great challenges for the
architecture community. First, we need to thoroughly understand the behaviors
of a variety of workloads. Second, our usual simulation-based research methods
become prohibitively expensive for big data. As big data is an emerging field,
more and more software stacks are being proposed to facilitate the development
of big data applications, which aggravates these challenges. In this paper, we
first use Principal Component Analysis (PCA) to identify the most important
characteristics from 45 metrics that characterize the big data workloads in
BigDataBench, a comprehensive big data benchmark suite. Second, we apply a
clustering technique to the principal components obtained from the PCA to
investigate the similarity among big data workloads, and we verify the
importance of including different software stacks for big data benchmarking.
Third, we select seven representative big data workloads by removing redundant
ones and release the BigDataBench simulation version, which is publicly
available from http://prof.ict.ac.cn/BigDataBench/simulatorversion/.
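The characterize-then-subset pipeline described above can be sketched with scikit-learn roughly as follows; the use of KMeans, the 90% variance cutoff, and the variable names are illustrative assumptions rather than BigDataBench's released scripts.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Illustrative sketch of the workflow: standardize the per-workload metric
# matrix, keep the principal components that explain most of the variance,
# cluster the workloads in the reduced space, and pick the workload closest
# to each cluster centre as that cluster's representative.

def subset_workloads(metrics, workload_names, n_representatives=7):
    """metrics: (num_workloads, 45) array of per-workload metric values."""
    X = StandardScaler().fit_transform(metrics)
    pcs = PCA(n_components=0.9).fit_transform(X)   # keep ~90% of the variance
    km = KMeans(n_clusters=n_representatives, n_init=10, random_state=0).fit(pcs)
    representatives = []
    for c in range(n_representatives):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(pcs[members] - km.cluster_centers_[c], axis=1)
        representatives.append(workload_names[members[np.argmin(dists)]])
    return representatives
```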
Construction of CaF2-appended PVA nanofibre scaffold
In this work, a new material, calcium fluoride (CaF2)-appended poly(vinyl alcohol) (PVA) nanofibre scaffold, was successfully prepared through the electrospinning technique. Scanning electron microscopy results showed that the morphology of the fibres was uniform and smooth, and the average diameter of the fibres was about 200 nm. Transmission electron microscopy results showed that many CaF2 nanoparticles were well dispersed in the PVA fibre matrix. The water-resistant ability of the scaffold was improved through intermolecular crosslinking of PVA by formaldehyde vapour. This novel material seems to be a promising scaffold for bone tissue engineering.
Hierarchical-level rain image generative model based on GAN
Autonomous vehicles are exposed to various weather conditions during operation,
which is likely to trigger the performance limitations of the perception
system, leading to safety of the intended functionality (SOTIF) problems. To efficiently
generate data for testing the performance of visual perception algorithms under
various weather conditions, a hierarchical-level rain image generative model,
rain conditional CycleGAN (RCCycleGAN), is constructed. RCCycleGAN is based on
the generative adversarial network (GAN) and can generate images of light,
medium, and heavy rain. Different rain intensities are introduced as labels in
conditional GAN (CGAN). Meanwhile, the model structure is optimized and the
training strategy is adjusted to alleviate the problem of mode collapse. In
addition, natural rain images of different intensities are collected and
processed for model training and validation. Compared with the two baseline
models, CycleGAN and DerainCycleGAN, the peak signal-to-noise ratio (PSNR) of
RCCycleGAN on the test dataset is improved by 2.58 dB and 0.74 dB, and the
structural similarity (SSIM) is improved by 18% and 8%, respectively. The
ablation experiments are also carried out to validate the effectiveness of the
model tuning.
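For reference, the PSNR and SSIM of a generated image against its ground-truth counterpart can be computed with scikit-image's standard implementations, roughly as sketched below; the file paths are placeholders and this is not the paper's evaluation code.

```python
from skimage.io import imread
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Sketch of the image-quality evaluation used to compare generative models:
# PSNR and SSIM between a generated image and a reference image.
# File names below are hypothetical placeholders.

def evaluate_pair(generated_path, reference_path):
    gen = imread(generated_path)
    ref = imread(reference_path)
    psnr = peak_signal_noise_ratio(ref, gen, data_range=255)
    ssim = structural_similarity(ref, gen, channel_axis=-1, data_range=255)
    return psnr, ssim

# Example: psnr, ssim = evaluate_pair("generated_rain.png", "ground_truth.png")
```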